Abstract



In this technical report, we address the problem of quasi-Euclidean reconstruction (i.e., close 
to scaled Euclidean but not necessarily exact) using an uncalibrated camera with a specific known 
type of motion, namely unknown but complete orbital motion. By orbital motion, we mean pure 
rotation about a fixed arbitrary axis. Exact scaled Euclidean reconstruction for orbital motion is 
not possible because of the 2 degree-of-freedom ambiguity [22]. We bypass the usual intermediate
stages of projective or affine reconstruction and recover 3-D structure directly from point
correspondences obtained from a two-stage bidirectional tracking process. 3-D reconstruction is done
by applying the iterative Levenberg-Marquardt algorithm to minimize error between actual point 
tracks and projected point tracks. We show that reconstruction from complete orbital motion is 
superior to that from partial orbital motion. This work also investigates the sensitivity of recovered 
quasi-Euclidean reconstruction, tilt, and object rotation to the actual camera tilt.

1 Introduction

In this technical report, we address the problem of quasi-Euclidean reconstruction using an
uncalibrated camera with a specific known type of motion, namely unknown but complete orbital
motion. By orbital motion, we mean pure rotation about a fixed arbitrary axis. Exact scaled
Euclidean reconstruction for orbital motion is not possible because of the 2 degree-of-freedom
ambiguity [22]. We bypass the usual intermediate stages of projective or affine reconstruction and
recover 3-D structure directly from point correspondences. The feature point correspondences are
obtained from a two-stage bidirectional tracking process. 3-D reconstruction is done by applying the
iterative Levenberg-Marquardt algorithm to minimize the error between actual point tracks and
projected point tracks. Initialization is based on estimating the tilt by fitting ellipses to point tracks
and assuming equal angular rotation between frames. Convergence is sped up by adopting an
object-centered representation. This work is an extension of [25], with the projection function
having the option of using all five camera intrinsic parameters (focal length f, aspect ratio r, image
skew a, and principal point (u_0, v_0)). We also demonstrate that results from complete orbital
motion are superior to those from partial orbital motion. This work also investigates the sensitivity of
quasi-Euclidean reconstruction to the actual camera tilt during orbital motion.

1.1 Prior work 

There is a large body of work done on 3-D reconstruction from images using an uncalibrated 
camera. One of the first steps taken prior to actual 3-D reconstruction is usually the process of 
camera calibration. Particularly germane to 3-D reconstruction using an uncalibrated camera is 
self-calibration. Self-calibration refers to recovery of camera parameters based only on
correspondences of images taken at different poses. Most work done on self-calibration relies on known
motions of the cameras, such as pure translational motion [2], known camera rotation angles [3], 
rotation about an unknown but fixed arbitrary axis [25], and pure rotation about the camera center 
[7]. 

The traditional approach to 3-D reconstruction with multiple images using an uncalibrated
camera is to apply affine and projective reconstruction techniques (such as [4, 20, 6, 8]).
Scaled Euclidean structure, on the other hand, is traditionally reconstructed using known camera
parameters. For example, Szeliski [23] and Matsumoto et al. [14] recover Euclidean structure from
object rotation (or equivalently, camera orbital motion) under the assumption that the camera
parameters and object motion are known. Niem and Wingbermühle [16] use a gridded annulus pattern
inside which the object is placed. Camera parameters are extracted from the detected pattern, and
the object is recovered from silhouettes. Zhang [26] proposes a closed-form solution for scaled
Euclidean reconstruction with known intrinsic camera parameters but unknown extrinsic camera
parameters. However, this technique assumes the existence of four coplanar correspondences that
are not collinear.

Recently, however, there has been interesting work done in reconstructing scaled Euclidean 
structure from images using an uncalibrated camera. In a work that is probably the closest to ours, 
Heyden and Åström [9] propose a technique to reconstruct scaled Euclidean structure under
constant but unknown intrinsic camera parameters. They showed that in general, it takes a minimum
of 3 images to recover a unique solution to the intrinsic camera parameters and scaled Euclidean
structure. This is done by considering the Kruppa constraints [13, 15]. Their technique of scaled
Euclidean reconstruction is based on recovering an intermediate projective structure. They then 
use an optimization formulation that is based on the Frobenius norm of a matrix. However, this is 
not equivalent to the more optimal metric of minimizing feature location errors in the 2-D image 
space. In a later work, they also show that scaled Euclidean reconstruction under known image as- 
pect ratio and skew but varying and unknown focal length and principal point is also possible [10]. 
The assumption is that the camera is undergoing general motion, as it is not possible to reconstruct 
scaled Euclidean structure under constrained motion such as pure translational or orbital motion. 

A two-step approach is used to recover scaled Euclidean structure from multiple image se- 
quences with unknown but constant intrinsic parameters [18]. The first stage involves affine cam- 
era parameter recovery using the so-called modulus constraint. This is followed by conversion to 
scaled Euclidean structure. This approach is subsequently extended to remove the assumption of 
fixed camera focal length [17]. 

Hartley devises an algorithm for camera self-calibration from several views [5]. He uses a 
two-step approach to recover scaled Euclidean structure. His algorithm first recovers projective
structure before applying a heuristic to search for the five intrinsic camera parameters. The
heuristic involves iterating over several sets of initialization values and checking for convergence. 

A detailed characterization of critical motion sequences (CMS) is given by Sturm [22]. A 
critical motion sequence is a camera motion sequence that results in ambiguities in reconstruction 
when camera parameters are unknown. For example, only affine structures can be extracted from 
pure camera translational motion. Of particular relevance to our work is Sturm's determination
that there is a two degree of freedom projective ambiguity for orbital motions (i.e., pure rotation 
about a fixed arbitrary axis). 



1.2 Quasi-Euclidean reconstruction from unknown but complete orbital motion

As Sturm has demonstrated [22], there exists a 2 degree-of-freedom ambiguity in scaled Euclidean
reconstruction from orbital motion. There are three options in recovering scaled Euclidean structure
from orbital motion: (1) fix two intrinsic camera parameters, (2) impose structural constraints (e.g.,
orthogonality, parallelism, known 3-D locations of fiducial points), or (3) get the "best" reconstruction
without (1) or (2). We choose option (3), though option (1) is a convenient alternative. In
particular, for option (1), we can assume the aspect ratio r to be 1.0 and the image skew a to
be 0. In practice, these parameters are usually fixed for a camera and need to be determined only
once.

We recover 3-D structure and orbital motion from a relatively sparse sequence of images
(typically between 30 and 50 images in a sequence). The unknowns are the camera intrinsic parameters,
global camera tilt, local rotation axis, amount of local rotation between successive frames, and the 
3-D positions associated with the tracked point features. 

The assumptions that we have made for our work are the following: 

• The image sequence is a "closed loop" sequence (i.e., featuring complete object rotation), 

• The object surface has sufficient texture for interframe feature tracking, 

• The object rotation is about an unknown but fixed axis, and 

• The camera either does not have significant radial distortion, or the radial distortion factor is 
known so that the images can be corrected first. 

The advantages of our approach of recovering structure and motion from complete object
rotation (or equivalently, complete camera orbital motion) are the following:

• The set-up is extremely simple and cheap (using a camera, tripod stand, and a turntable 
would do), 

• The global tilt angle, local rotation angles, and quasi-Euclidean structure can be extracted 
simultaneously, 

• Calibration of camera is not required, 

• Initialization of the system is simple due to the known type of motion,

• Recovery of the intermediate affine or projective structure is not necessary, and

• The method exhibits fast convergence and is stable. 

1.3 Organization 

In Section 2, we outline the general structure from motion problem and our approach to
recovering the solution in the special case of unknown but complete orbital motion. The
bidirectional tracking scheme employed to recover the point tracks for a given image sequence is
described in Section 3. One of the important parameters in orbital motion is the amount of camera
tilt. Section 4 shows, through simulations, how sensitive the recovered shape and camera parameters
are to the actual camera tilt. The results of reconstruction using image sequences of rotated
simulated and real objects are given in Section 5. This section also illustrates the improvement in
results due to the knowledge of complete camera orbital motion (or object rotation). The method
and its results are discussed in Section 6; finally, a summary of the method and its implications
is presented in Section 7.

2 General structure from motion 

The formulation of recovering structure from motion is based on that of [25]. Essentially, we are
trying to recover a set of 3-D structure parameters $p_i$ and time-varying motion parameters $T_j$ from
a set of observed image features $u_{ij}$. The general equation linking a 2-D image feature location $u_{ij}$
in frame $j$ to its 3-D position $p_i$ ($i$ is the track index) is

$$u_{ij} = \mathcal{P}\left(T_j^{(K)} \cdots T_j^{(1)} p_i\right) \tag{1}$$

where the perspective projection transformation $\mathcal{P}()$ is applied to a cascaded series of rigid
transformations $T_j^{(k)}$. Each transformation is in turn defined by

$$T_j^{(k)} x = R_j^{(k)} x + t_j^{(k)} \tag{2}$$

where $R_j^{(k)}$ is a rotation matrix and $t_j^{(k)}$ is a translation applied after the rotation.

Within each of the cascaded transforms, the motion parameters may be time-varying (the j 
subscript is present) or fixed (the subscript is dropped). The transformation associated with the 
(horizontal) orbital motion that we are considering in our work is

(3)

where $R_{x,\tau}$ and $R_{z,\theta}$ represent rotation about the x-axis by $\tau$ and about the z-axis by $\theta$, respectively.
We assume negligible cyclotorsion (rotation of the camera about the viewing axis).
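
The two rotations named above can be sketched numerically. This is a minimal illustration, not the report's implementation: the helper names `rot_x`, `rot_z`, and `orbital_transform` are hypothetical, and the composition order (local rotation about z first, then tilt about x, with no translation) is an assumption.

```python
import numpy as np

def rot_x(tau):
    """Rotation about the x-axis by tau radians (camera tilt)."""
    c, s = np.cos(tau), np.sin(tau)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def rot_z(theta):
    """Rotation about the z-axis by theta radians (local object rotation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def orbital_transform(points, tau, theta_j):
    """Apply R_x(tau) @ R_z(theta_j) to an (N, 3) array of object points.

    The order (z-rotation first, then tilt) is an assumption of this sketch.
    """
    R = rot_x(tau) @ rot_z(theta_j)
    return points @ R.T

# A point on the x-axis, rotated a quarter turn about z and then tilted by 90 degrees.
p = np.array([[1.0, 0.0, 0.0]])
q = orbital_transform(p, tau=np.pi / 2, theta_j=np.pi / 2)
```

Only `theta_j` changes from frame to frame in this sketch; the tilt `tau` is shared by all frames, which mirrors the distinction between time-varying and fixed motion parameters made above.
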
The general camera-centered perspective projection equation is

(4)

where $f$ is the product of the focal length of the camera and the pixel array scale factor, $r$ is the
image aspect ratio, $a$ is the image skew, and $(u_0, v_0)$ is the principal point.

An alternative object-centered formulation (a more general version of [25]) which we use is

(5)



Here, we assume that the $(x, y, z)$ coordinates before projection are with respect to a reference
frame that has been displaced away from the camera by a distance $t_z$ along the optical axis,¹ with
$s = f/t_z$ and $\eta = 1/t_z$. The projection parameter $s$ can be interpreted as a scale factor and $\eta$
as a perspective distortion factor. Our alternative perspective formulation results in a more robust
recovery of camera parameters under weak perspective, where $\eta \ll 1$; assuming
$(u_0, v_0) \approx (0, 0)$ and $a \approx 0$, we then have $\mathcal{P}(x, y, z)^T \approx (sx, rsy)^T$. This is because $s$ and $rs$ can be
much more reliably recovered than $\eta$, in comparison with the old formulation where $f$ and $t_z$ are very
highly correlated.
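
The relationship between the camera-centered and object-centered parameterizations can be checked numerically. The sketch below assumes a standard pinhole model in which the skew enters as `a * y / z`; that placement of the skew term, and the function names, are assumptions made for illustration rather than the report's exact formulas. The values f = 324 and t_z = 225 are those used in the simulations later in the report.

```python
import numpy as np

def project_camera_centered(xyz, f, r=1.0, a=0.0, u0=0.0, v0=0.0):
    """Pinhole projection of camera-frame points, shape (N, 3). Skew form is assumed."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    u = (f * x + a * y) / z + u0
    v = r * f * y / z + v0
    return np.stack([u, v], axis=1)

def project_object_centered(xyz, s, eta, r=1.0, a=0.0, u0=0.0, v0=0.0):
    """Same projection for points given relative to a frame t_z = 1/eta from the camera."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    denom = 1.0 + eta * z
    u = (s * x + a * eta * y) / denom + u0
    v = r * s * y / denom + v0
    return np.stack([u, v], axis=1)

f, t_z = 324.0, 225.0
s, eta = f / t_z, 1.0 / t_z

# Two object-centered points well inside a ball of radius 25.
pts = np.array([[10.0, -5.0, 3.0], [-20.0, 15.0, -8.0]])

cam = project_camera_centered(pts + np.array([0.0, 0.0, t_z]), f)
obj = project_object_centered(pts, s, eta)
weak = s * pts[:, :2]   # weak-perspective limit (sx, rsy) with r = 1
```

Under these assumptions `cam` and `obj` agree exactly, while `weak` differs from them by roughly a pixel, which illustrates why $s$ is well determined even when $\eta$ is small.
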



2.1 Least-squares minimization 

The Levenberg-Marquardt algorithm [19] is used to solve for the structure and motion parameters. 
The merit or objective function that we minimize is 

$$C(\mathbf{a}) = \sum_i \sum_j c_{ij} \left| \tilde{u}_{ij} - f(\mathbf{a}_{ij}) \right|^2 \tag{6}$$

where $f()$ is given in (1) and

$$\mathbf{a}_{ij} = \left(p_i^T, m_j^T, m_g^T\right)^T \tag{7}$$

is the vector of structure and motion parameters which determine the image of point $i$ in frame
$j$. The vector $\mathbf{a}$ contains all of the unknown structure and motion parameters, including the 3-D
points $p_i$, the time-dependent motion parameters $m_j$, and the global motion/calibration parameters
$m_g$. The weight $c_{ij}$ in (6) describes our confidence in measurement $\tilde{u}_{ij}$, and is normally set to the
inverse variance $\sigma_{ij}^{-2}$. Implementational details are given in [25]. The primary difference between
that work and this one is that we incorporate the additional camera intrinsic parameters $r$, $a$,
$u_0$, and $v_0$. The extensions (calculating the required derivatives for the Hessian matrix and gradient
vector) are relatively straightforward.

¹If we wish, we can view $t_z$ as the z component of the original global translation which is absorbed into the
projection equation, and then set the third component of $t$ to zero.
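
As a small illustration of the objective function, the following sketch evaluates a weighted sum of squared reprojection errors for given observed and predicted feature positions. The array layout and the use of a zero weight to mark unobserved point-frame pairs are assumptions of this example, not details taken from [25].

```python
import numpy as np

def objective(observed, predicted, weights):
    """Weighted sum of squared reprojection errors.

    observed, predicted: (N_points, N_frames, 2) image positions.
    weights: (N_points, N_frames) confidences c_ij; 0 marks an unobserved pair.
    """
    sq_err = np.sum((observed - predicted) ** 2, axis=-1)
    return float(np.sum(weights * sq_err))

sigma = 0.5                                   # assumed pixel noise standard deviation
observed = np.array([[[1.0, 2.0], [3.0, 4.0]]])
predicted = np.array([[[1.5, 2.0], [3.0, 3.0]]])
weights = np.array([[1.0 / sigma**2, 0.0]])   # second frame not observed
cost = objective(observed, predicted, weights)
```

Here only the first frame contributes: a squared error of 0.25 weighted by the inverse variance 4.0.
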

2.2 Two-stage approach 

As Sturm has shown [22], there is a 2 degree-of-freedom ambiguity in scaled Euclidean reconstruc- 
tion for orbital motion. A simple solution to this is to set two of the intrinsic camera parameters 
to a constant (either as an assumption or through calibration). A good choice would be to fix the 
image skew factor a (say to 0) and the aspect ratio r (say to 1). However, in our work, we keep 
these parameters unknown. 

We use a two-step approach in acquiring the quasi-Euclidean structure: 

• Fix a = u_0 = v_0 = 0 and r = 1 and apply the Levenberg-Marquardt algorithm until the
terminating conditions are met. Termination occurs when either the number of iterations
exceeds a threshold (150 in our case) or the improvement in the objective function is too
small (10~^ in our case), whichever comes first. 

• Set a, u_0, v_0, and r to be free variables and resume the Levenberg-Marquardt algorithm
(with the iteration number reset to 0) until the same termination conditions are again met.

In general, using this two-step approach has resulted in better convergence towards lower image 
projection errors and better structure and motion recovery in comparison to not applying the first 
step. This observation is based on more than 100 runs that involve synthetic image sequences and 
about 6 runs involving real image sequences. 
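
The two-step schedule can be sketched on a toy problem. The code below is not the report's structure-from-motion system: it fits a simplified weak-perspective model (an assumption) with a minimal Levenberg-Marquardt loop, and the helper `levenberg_marquardt`, the parameter ordering, and the tolerance value are all hypothetical. It only illustrates fixing a, u_0, v_0, and r in the first pass and freeing them in the second.

```python
import numpy as np

def levenberg_marquardt(residual, x0, free, max_iter=150, tol=1e-12):
    """Tiny Levenberg-Marquardt loop that updates only the entries of x marked free."""
    x = np.asarray(x0, dtype=float).copy()
    idx = np.flatnonzero(free)
    lam, eps = 1e-3, 1e-6
    cost = np.sum(residual(x) ** 2)
    for _ in range(max_iter):
        r = residual(x)
        # Forward-difference Jacobian with respect to the free parameters only.
        J = np.empty((r.size, idx.size))
        for k, i in enumerate(idx):
            step = np.zeros_like(x)
            step[i] = eps
            J[:, k] = (residual(x + step) - r) / eps
        H, g = J.T @ J, J.T @ r
        delta = np.linalg.solve(H + lam * np.diag(np.diag(H)), -g)
        trial = x.copy()
        trial[idx] += delta
        trial_cost = np.sum(residual(trial) ** 2)
        if trial_cost < cost:                 # accept the step, relax damping
            improvement = cost - trial_cost
            x, cost, lam = trial, trial_cost, lam / 10.0
            if improvement < tol:
                break
        else:                                 # reject the step, increase damping
            lam *= 10.0
    return x, cost

# Toy weak-perspective model (an assumption): u = s*x + a*y + u0, v = r*s*y + v0.
rng = np.random.default_rng(0)
xy = rng.uniform(-25.0, 25.0, size=(50, 2))
true = np.array([1.44, 0.02, 1.0, -0.5, 1.05])      # s, a, u0, v0, r

def model(params):
    s, a, u0, v0, r = params
    return np.stack([s * xy[:, 0] + a * xy[:, 1] + u0,
                     r * s * xy[:, 1] + v0], axis=1)

observed = model(true)

def residual(params):
    return (model(params) - observed).ravel()

start = np.array([1.0, 0.0, 0.0, 0.0, 1.0])
# Step 1: only s is free; a = u0 = v0 = 0 and r = 1 stay fixed.
stage1, cost1 = levenberg_marquardt(residual, start, free=[True, False, False, False, False])
# Step 2: all parameters are free, starting from the step-1 result.
stage2, cost2 = levenberg_marquardt(residual, stage1, free=[True] * 5)
```

In this noise-free toy the second pass recovers the generating parameters; the first pass simply provides a starting point in which the dominant scale parameter is already reasonable.
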

2.3 Initialization 

The initialization of object motion for each frame is made simpler by the fact that we know that
object motion is a rotation about a unique (but unknown) axis. If there are $N_{frame}$ images in the
sequence, then the local rotation angle about the z-axis associated with image $j$ is initialized to
$2\pi j / N_{frame}$. The camera tilt is estimated by fitting ellipses to the tracks. The estimated tilt used for
initialization is usually within 15° of the actual camera tilt (based on simulation results).

The scale factor $s$ and perspective distortion factor $\eta$ are initialized to arbitrary values of 1.0 and
0.001 respectively. The algorithm does not appear to be too sensitive to these values, as changes
of up to about an order of magnitude are tolerated.
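
The initialization described in this section can be sketched as follows. The ellipse fit here is a simple covariance-based stand-in, not necessarily the fitting method used in the report, and it assumes orthographic viewing, a track that covers the full turn with uniform angular sampling, and a tilt convention in which 0° is a side-on view. The function names are hypothetical.

```python
import numpy as np

def initial_rotation_angles(n_frames):
    """Equal angular spacing over a full turn: theta_j = 2*pi*j / n_frames."""
    return 2.0 * np.pi * np.arange(n_frames) / n_frames

def tilt_from_track(track):
    """Estimate tilt (radians) from one full-turn track, an (N, 2) array.

    Assumes orthographic viewing and uniform angular sampling, so that the
    ratio of the ellipse axes equals sin(tilt), with tilt = 0 for a side view.
    """
    centered = track - track.mean(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(centered.T))  # ascending order
    ratio = np.sqrt(max(eigvals[0], 0.0) / eigvals[1])
    return np.arcsin(min(ratio, 1.0))

# Synthetic track: a point at radius 2 and height 0.7, seen with a 30 degree tilt.
tau_true = np.radians(30.0)
theta = initial_rotation_angles(40)
track = np.stack([2.0 * np.cos(theta),
                  2.0 * np.sin(theta) * np.sin(tau_true) + 0.7 * np.cos(tau_true)],
                 axis=1)
tau_est = tilt_from_track(track)
```

With real, partially visible tracks the covariance shortcut would be biased, which is one reason an explicit ellipse fit is preferable in practice.
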



3 Tracking 

To produce tracks in a circular sequence as input to the iterative least-squares minimizing proce- 
dure, we first mask out regions that do not change significantly in intensity through simple pairwise 
image subtraction. This is effective in removing the static background and low object texture areas. 
The tracking then proceeds by way of bidirectional tracking with three stages: 

1. Pairwise global spline-based registration [24], 

2. Automatic selection of spline node points with high local texture, 

3. Multiframe Shi-Tomasi local tracking [21] of these spline nodes. 

These three stages are necessary to incorporate the advantages of both spline-based registration and 
local tracking techniques. While the spline-based registration technique is able to track relatively 
significant motion, it is not able to deal with motion discontinuity very well due to its implicit 
smoothness constraint. On the other hand, the local tracking technique performs very poorly for 
significant motion but very well within the vicinity of the true track position. Because points 
are tracked independently for local tracking, motion discontinuities can be handled. Since the
spline-based registration technique yields a reasonably good estimate of motion, the local tracking
technique can then make use of this estimate to improve on the new track image position, especially
if the position is within the vicinity of a motion discontinuity.

In the first stage, image flow between successive frames in the circular sequence is computed 
in both directions using the spline-based registration technique. Then, in stage 2, for each frame,
points with high local texture (indicated by the minimum eigenvalue of the local Hessian) are 
automatically chosen for tracking. Finally, at the third and final stage of tracking, for every frame, 
the chosen points are then individually locally tracked in both directions using the flow estimates 
from stage 1. The tracking stops (tracking in the two directions is independent) if any of the 
following conditions is violated: 

• Each track moves continuously in one direction,

• The RMS pixel difference error is less than a threshold (15 in our case), and

• The minimum eigenvalue is above a threshold (500 in our case).
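
The texture measure and the stopping test can be sketched as below. The sketch computes the smaller eigenvalue of the 2x2 matrix of summed gradient products over a patch, which is the usual Shi-Tomasi measure and is taken here to be what the text calls the local Hessian; the patch handling, the function names, and the way the three conditions are combined are assumptions of this example.

```python
import numpy as np

def min_eigenvalue(patch):
    """Smaller eigenvalue of the 2x2 gradient matrix summed over a patch."""
    gy, gx = np.gradient(patch.astype(float))
    gxx, gyy, gxy = np.sum(gx * gx), np.sum(gy * gy), np.sum(gx * gy)
    # Closed-form eigenvalues of [[gxx, gxy], [gxy, gyy]].
    half_trace = 0.5 * (gxx + gyy)
    root = np.sqrt((0.5 * (gxx - gyy)) ** 2 + gxy ** 2)
    return half_trace - root

def track_is_valid(rms_error, min_eig, same_direction, rms_max=15.0, eig_min=500.0):
    """All three conditions must hold for tracking to continue."""
    return bool(same_direction and rms_error < rms_max and min_eig > eig_min)

ramp = np.tile(np.arange(7.0), (7, 1))                # intensity varies in one direction only
textured = np.outer(np.arange(7.0), np.arange(7.0))   # intensity varies in both directions

ramp_score = min_eigenvalue(ramp)
textured_score = min_eigenvalue(textured)
```

A patch whose intensity changes in only one direction scores zero, so it is rejected no matter how strong the edge is; only patches with gradients in two directions give a large minimum eigenvalue.
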

The problem with the first criterion above is that a complete track that is observable in all the 
frames is not possible. However, the need to reject random noise is greater, and having a few more 
redundant points is a relatively small price to pay. The tracks are postprocessed to remove those 
that are deemed too short (we impose a minimum track length of 3). 


4 Sensitivity to camera tilt



To ensure that the ground truth is available, experiments were conducted using a collection of
synthetic 3-D points. The 3-D points were generated at random, and the only constraint on their
location is that they have to be within a radius of 25 from the object's local center. The object's local
center is chosen to be at a distance of 225 from the camera center. Each 3-D point is oriented, i.e.,
each point has a normal associated with it. This allows "visibility" to be determined based on the camera
pose; a point $p_i$ with normal $n_i$ is visible under camera pose $j$ if $n_i \cdot z_{j,cam} < 0$, where $z_{j,cam}$ is the
jth unit camera viewing direction away from the camera. Tracking is not explicitly done here,
since we can calculate the 2-D image coordinates directly based on known camera and object
transformations. Structure from orbital motion is then performed on the artificial tracks, with
different image (Gaussian) noise levels.
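
A minimal sketch of this synthetic setup is given below. The report does not state how the points and normals were sampled, so the uniform-in-volume sampling, the random unit normals, and the function names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_oriented_points(n, radius=25.0):
    """Random points inside a ball with random unit normals, both of shape (n, 3)."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.uniform(size=(n, 1)) ** (1.0 / 3.0)  # uniform in volume
    normals = rng.normal(size=(n, 3))
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return directions * radii, normals

def visible(normals, view_dir):
    """A point is visible when its normal faces the camera: n . z_cam < 0."""
    return normals @ view_dir < 0.0

points, normals = random_oriented_points(200)
z_cam = np.array([0.0, 0.0, 1.0])   # unit viewing direction away from the camera
mask = visible(normals, z_cam)
```

For each camera pose the mask selects which points contribute a track sample, so different points are observed over different arcs of the orbit.
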

The RMS error in recovered 3-D is calculated by finding 



and taking the square root of $E$. $N_{points}$ is the number of points, and $p_{i,cor}$ and $p_i$ are the theoretically
correct and recovered ith points respectively. $m$ is the global scale, $R$ the global rotation, and $c$
the global translation; these parameters constitute the best rigid body transformation between the
theoretically correct and recovered sets of points. We implemented Horn's algorithm, which uses a
closed-form solution for this least-squares problem [11]. An alternative method that can be used is
based on the singular value decomposition (SVD) of a 3x3 matrix [1].
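
The alignment step can be sketched with the SVD-based alternative. The code below is not Horn's quaternion method; it is a standard SVD solution for scale, rotation, and translation, and it assumes the error is measured as the mean squared distance between each correct point and the transformed recovered point. Function names are hypothetical.

```python
import numpy as np

def align_similarity(recovered, correct):
    """Best scale m, rotation R, translation c such that correct ~ m * R @ p + c."""
    mu_p, mu_q = recovered.mean(axis=0), correct.mean(axis=0)
    P, Q = recovered - mu_p, correct - mu_q
    cov = Q.T @ P / len(P)                       # 3x3 cross-covariance matrix
    U, D, Vt = np.linalg.svd(cov)
    sign = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    S = np.array([1.0, 1.0, sign])               # guards against a reflection
    R = U @ np.diag(S) @ Vt
    m = np.sum(D * S) / np.mean(np.sum(P ** 2, axis=1))
    c = mu_q - m * (R @ mu_p)
    return m, R, c

def rms_error(recovered, correct):
    """RMS 3-D error after the best similarity alignment."""
    m, R, c = align_similarity(recovered, correct)
    residual = correct - (m * recovered @ R.T + c)
    return np.sqrt(np.mean(np.sum(residual ** 2, axis=1)))

# Check on synthetic data: the "correct" set is a scaled, rotated, shifted copy.
rng = np.random.default_rng(1)
recovered = rng.normal(size=(20, 3))
ang = 0.3
R0 = np.array([[np.cos(ang), -np.sin(ang), 0.0],
               [np.sin(ang), np.cos(ang), 0.0],
               [0.0, 0.0, 1.0]])
correct = 2.5 * recovered @ R0.T + np.array([1.0, -2.0, 3.0])
m, R, c = align_similarity(recovered, correct)
err = rms_error(recovered, correct)
```

Because the recovered structure is only defined up to a global scale, rotation, and translation, the error is meaningful only after this alignment.
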

We ran experiments involving a variety of Gaussian image noise levels, namely 0.0, 0.1, 0.3, 0.5,
1.0, and 2.0 pixels. As a reference, the projected 3-D points are distributed within an area of
approximately 60x60 pixels. The results of the experiment are shown graphically in Figure 1. For
this figure, note that the local rotation is constrained to be about the z-axis. To reiterate, as a point
of reference for the graph shown in Figure 1(a), the 3-D points are randomly distributed within a
ball of radius 25, and the center of this ball is at a distance of 225 from the center of the
camera. The actual focal length is 324.

As can be seen in Figure 1, the error in recovered shape and tilt generally increases with the 
camera tilt. There are two surprises: (1) the very gradual increase in error followed by a very sharp 
increase in error with increasing actual camera tilt, and (2) the error in the local rotation angle 
estimate does not seem to exhibit a consistent trend. The sharpness of the reciprocal trend in (1) is 
somewhat attenuated by noise, as shown in Figure 1(a). 

5 Results 

In this section, we present results using a synthetic object as well as several real objects. In addi- 
tion, we compare results obtained from complete rotation with those from the exact same sequence, 
but without the completeness of rotation assumption. The initialization is the same for both cases. 

5.1 Sequence of rotating synthetic cylindrical object 

In the previous synthetic examples, the tracks were computed directly from known camera param- 
eters. In this example, however, a textured cylinder was defined, rotated, and then rendered using 
Rayshade. (Rayshade is a program for producing ray-traced color images [12].) The sequence of 
images produced in this manner is then treated as a normal input sequence whose bidirectional 
point tracks are recovered and fed into the "structure from complete orbital motion" module. The 
camera tilt is maintained at 0°, and there are 40 frames in this sequence (two of which are shown
in Figure 2(a) and (b)). Two views of the reconstructed 3-D points are shown in Figure 2(c) and (d).
The average (best-fit) cross-sectional radius of the recovered points is 0.823869, while the RMS
error of the fit in radius is 0.063680. Since the actual radius of the cylinder is 1.0, the rescaled RMS
error of the radius fit is 0.077293842. Because the distance of the cylinder from the camera is 8.0, the
rescaled RMS error of the radius fit of the recovered set of 3-D points constitutes only about 1% of the
camera distance.
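
The rescaling quoted above can be reproduced directly from the reported numbers. The variable names are ours; the values are those given in the text.

```python
radius_fit = 0.823869      # best-fit radius of the recovered points
rms_fit = 0.063680         # RMS error of the radius fit
camera_distance = 8.0      # cylinder-to-camera distance (actual radius is 1.0)

# Dividing by the fitted radius expresses the error in units where the radius is 1.0.
rescaled_rms = rms_fit / radius_fit
fraction_of_distance = rescaled_rms / camera_distance
```

The rescaled error is about 0.0773, which is a little under 1% of the camera distance.
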

5.2 Sequences of rotating real objects 

In this section, we show results of applying the technique to four image sequences of real ob- 
jects, namely a dodecahedron (Figure 3), a film box (Figure 4), a toy frog (Figure 5), and a cube 
(Figure 6).

Figure 1: Graphs illustrating sensitivity of recovered (a) shape, (b) tilt angle, (c) local rotation
angle, and (d) focal length, to the actual camera tilt during complete orbital motion. See text for a 
description of the conditions to which the experiments were subject. 




Figure 2: Rotating synthetic cylinder (complete rotation, 40 frames): (a) 1st frame, (b) 5th frame, 
(c) top view, and (d) side view of recovered 3-D points (2291 points). 

As can be seen, the 3-D points for all these sequences have been recovered reasonably well. 
The film box was the most difficult sequence, since the surface of the box is highly specular. This 
resulted in noisier 3-D position estimates (Figure 4). In addition, it can be observed from Figures 3-5
that the point tracks are not all correct. Outlier rejection was used in the Levenberg-Marquardt
minimization to remove points that may have been wrongly tracked.

5.3 Complete rotation vs. incomplete rotation 

Most work on structure from rotation does not use complete rotation. For the same number of frames,
it is logical to deduce that complete rotation, which yields a circular sequence where the first and last
frames are adjacent, gives a better reconstruction due to the extra constraints on feature points
between these frames and their vicinities. As an illustration, Figure 7 shows the result of applying
normal structure from incomplete rotation. The errors in estimating the rotation angles are all
biased in one direction due to the single "open" chain of constraints, causing the "pinched"
appearance. In comparison, the reconstructed 3-D points of the same object under complete rotation
(Figure 6) do not exhibit the same "pinched" appearance.

Figure 3: Rotating film box (complete rotation, 31 frames): (a) 1st frame, (b) 5th frame, (c) point
tracks, (d)-(e) top and side views of recovered 3-D points (1414 points).

Another example is shown in Figure 8. Figure 8(a) shows the top view of the reconstructed 3-D 
points for the case when complete rotation is assumed, while Figure 8(b) shows the same points 
for the case when complete rotation is not assumed. In this case, each track that spans across the 
first and last frames is first broken up into two shorter tracks. The longer of the two fragmented 
tracks is used while the other is discarded. Not surprisingly, the reconstruction for the
complete rotation case is better. There are 416 points on the cube, and the camera tilt is 45° for
both cases. The tracks were synthetically generated with no noise, and there were 45 frames with 
equiangularly spaced object rotation. Initialization for both cases is exactly the same. 
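
The effect of an "open" versus a "closed" chain of constraints can be illustrated with a deliberately simple toy. This is not the report's method, in which the closure comes from point tracks that bridge the first and last frames; here the closure is imposed directly by requiring the frame-to-frame rotation estimates to sum to a full turn. The 3% bias and the function names are assumptions of the example.

```python
import numpy as np

def cumulative_angles(increments):
    """Rotation angle of each frame, obtained by chaining frame-to-frame estimates."""
    return np.concatenate([[0.0], np.cumsum(increments)[:-1]])

def close_loop(increments):
    """Spread the closure residual evenly so the increments sum to a full turn."""
    residual = np.sum(increments) - 2.0 * np.pi
    return increments - residual / len(increments)

n_frames = 45
true_step = 2.0 * np.pi / n_frames
biased = np.full(n_frames, true_step * 0.97)   # every step underestimated by 3%

truth = cumulative_angles(np.full(n_frames, true_step))
open_err = np.abs(cumulative_angles(biased) - truth).max()
closed_err = np.abs(cumulative_angles(close_loop(biased)) - truth).max()
```

With the open chain the bias accumulates to its largest value at the last frame, whereas the closure constraint removes this particular kind of systematic error entirely.
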



6 Discussion 



Despite the reconstruction ambiguities inherent to orbital motion, we have shown, through synthetic
and real scenes, that reasonable quasi-Euclidean reconstruction can be done. This is because in
practice, the camera intrinsic parameters of principal point (u_0, v_0), image skew a, and aspect ratio
r do not usually deviate significantly from the normally assumed values of (0, 0), 0, and 1
respectively. Our preference is to make maximal use of knowledge of the camera to reconstruct
3-D shape directly, rather than converting to intermediate projective or affine representations.

Figure 4: Rotating film box (complete rotation, 31 frames): (a) 1st frame, (b) 5th frame, (c) point
tracks, (d)-(f) top, front, and side views of recovered 3-D points (1414 points).

Results of experiments with simulated tracks having varying degrees of image feature location
noise and actual camera tilt (relative to the vertical axis) were generally as expected, with some
surprises. While errors in estimated reconstructed shape, camera tilt, and focal length increased 
with actual camera tilt, the rate of increase in these errors is unexpected. The rate of increase in 
error is generally gradual up until angles close to 90°, when the rate suddenly shoots up. This 
suggests that any reasonable tilt angle is acceptable without significant degradation in the fidelity 
of 3-D reconstruction and motion recovery. Another surprise is the insensitivity of errors (based on 
their apparent randomness in the results) in estimated local rotation angles with both image noise 
and actual camera tilt.




Figure 5: Rotating toy frog (complete rotation, 43 frames): (a) 1st frame, (b) 5th frame, (c) 2260 
point tracks, (d)-(f) top, front, and side views of recovered 3-D points, (g)-(i) corresponding views 
of the actual toy frog.

Figure 7: Rotating cube (not complete rotation): (a) First frame of the sequence, and (b) Top view
of recovered points for the cube scene using 96 frames and known camera parameters. (From [25].) 
Notice the pinching effect.

Figure 8: Rotating synthetic cube: (a) Reconstruction from complete rotation, (b) Reconstruction 
from incomplete rotation. 

Finally, not surprisingly, results for experiments with the assumption that the object rotation (or
equivalently, camera orbital motion) spans a full 360° are better than those with no such
assumption. The better results are due to the additional constraints available in connection with the point
tracks that bridge the first and last frames of the (circular) image sequence. One can anticipate
that in the general case, structure and motion recovery from arbitrary but "closed loop" motion of 
the camera would be more accurate than if there is no assumption of direct connectedness between 
the first and last frames of the image sequence. 



7 Summary 

We have described a completely automatic method of recovering quasi-Euclidean structure from
unknown but complete object rotation (or equivalently, camera orbital motion). This method
starts with two-stage bidirectional tracking, followed by the application of iterative
Levenberg-Marquardt minimization of feature point error to recover structure and rotational motion
simultaneously. Apart from the knowledge that the object was rotated completely about a unique axis, no
other camera parameters are assumed known.

This technique directly locally minimizes the error between the projected image feature
position and measured feature position, and no intermediate affine or projective reconstruction is done.
This technique usually converges toward the correct solution due to the assumption of complete
object rotation, which simplifies initialization. In addition, because we know the motion is
that of rotation, we can always make a good initial estimate of the camera tilt by fitting ellipses to
the recovered tracks.

Simulations have indicated the surprising result that the errors in recovered shape, tilt, and
focal length exhibit a very sharp reciprocal relationship with respect to the actual camera tilt. The
sharpness is attenuated by 2-D feature location (Gaussian) noise. Another interesting result of the
same simulations is the relative indifference of errors in recovered local rotation angles to both 
actual camera tilt and noise. Applying the technique on image sequences of real rotated objects 
has yielded very reasonable-looking results of recovered 3-D object feature distribution.